Voice deepfake


From Real to Cloned Singer Identification

Desblancs, Dorian, Meseguer-Brocal, Gabriel, Hennequin, Romain, Moussallam, Manuel

arXiv.org Artificial Intelligence

Cloned voices of popular singers sound increasingly realistic and have gained popularity over the past few years. They however pose a threat to the industry due to personality rights concerns. As such, methods to identify the original singer in synthetic voices are needed. In this paper, we investigate how singer identification methods could be used for such a task. We present three embedding models that are trained using a singer-level contrastive learning scheme, where positive pairs consist of segments with vocals from the same singers. These segments can be mixtures for the first model, vocals for the second, and both for the third. We demonstrate that all three models are highly capable of identifying real singers. However, their performance deteriorates when classifying cloned versions of singers in our evaluation set. This is especially true for models that use mixtures as an input. These findings highlight the need to understand the biases that exist within singer identification systems, and how they can influence the identification of voice deepfakes in music.
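The singer-level contrastive scheme the abstract describes (positive pairs are segments sung by the same singer, other segments in the batch act as negatives) is commonly implemented with an NT-Xent/InfoNCE-style objective. The paper does not publish its loss in this abstract, so the following NumPy version is only an illustrative sketch of that family of losses, not the authors' implementation; the function name and temperature value are assumptions.

```python
import numpy as np

def ntxent_loss(z_a, z_b, temperature=0.1):
    """Toy NT-Xent loss for paired embeddings.

    Row i of z_a and row i of z_b are segments from the same singer
    (a positive pair); every other row in the batch is a negative.
    This is a generic contrastive loss, assumed for illustration.
    """
    # L2-normalize so the dot product is cosine similarity
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    sim = z_a @ z_b.T / temperature            # (N, N) similarity matrix
    # Cross-entropy with the diagonal (the true pair) as the target class
    logits = sim - sim.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

Under this objective, a batch where each singer's two segments embed close together yields a low loss, while mismatched embeddings push the loss toward log(N), which is what drives the encoder to cluster segments by singer.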


AI-Generated Voice Deep Fakes Aren't Scary Good--Yet

WIRED

There have been a couple of high-profile incidents in recent years in which cybercriminals have reportedly used voice deepfakes of company CEOs in attempts to steal large amounts of money--not to mention that documentarians posthumously created voice deepfakes of Anthony Bourdain. But are criminals at the turning point where any given spam call could contain your sibling's cloned voice desperately seeking "bail money"? No, researchers say--at least not yet. The technology to create convincing, robust voice deepfakes is powerful and increasingly prevalent in controlled settings or situations where extensive recordings of a person's voice are available. At the end of February, Motherboard reporter Joseph Cox published findings that he had recorded five minutes of himself talking and then used a publicly available generative AI service, ElevenLabs, to create voice deepfakes that defeated a bank's voice-authentication system.


Voice deepfakes are getting easier to spot

#artificialintelligence

New research has shown that voice deepfakes are becoming easier to spot as synthetic recreations of real voices, thanks to the anatomy of our vocal tracts. Researchers at the University of Florida have devised a method of simulating images of a human vocal tract's apparent movements while a voice clip - real or fake - is played back. Professor of Computer and Information Science and Engineering Patrick Traynor and PhD student Logan Blue wrote that they and their colleagues found that simulations prompted by voice deepfakes weren't constrained by "the same anatomical limitations humans have", with some vocal tract measurements having "the same relative diameter and consistency as a drinking straw". Though scientists are starting to spot voice deepfakes with simulation and anatomical comparison, the risk of an ordinary person being tricked by any deepfake - which could lead to identity theft - remains a problem. Ordinary people don't yet have access to these tools.


Everyone will be able to clone their voice in the future

#artificialintelligence

Cloning your voice using artificial intelligence is simultaneously tedious and simple: hallmarks of a technology that's just about mature and ready to go public. All you need to do is talk into a microphone for 30 minutes or so, reading a script as carefully as you can (in my case: the voiceover from a David Attenborough documentary). After starting and stopping dozens of times to re-record your flubs and mumbles, you'll send off the resulting audio files to be processed and, in a few hours' time, be told that a copy of your voice is ready and waiting. Then, you can type anything you want into a chatbox, and your AI clone will say it back to you, with the resulting audio realistic enough to fool even friends and family -- at least for a few moments. The fact that such a service even exists may be news to many, and I don't believe we've begun to fully consider the impact easy access to this technology will have.


Thieves Reportedly Used Voice Deepfake of a CEO to Steal $240,000

#artificialintelligence

Thieves used voice-mimicking software to imitate a company executive's speech and dupe his subordinate into sending hundreds of thousands of dollars to a secret account, the company's insurer said, in a remarkable case that some researchers are calling one of the world's first publicly reported artificial-intelligence heists. The managing director of a British energy company, believing his boss was on the phone, followed orders one Friday afternoon in March to wire more than $240,000 (roughly Rs. 1.7 crores) to an account in Hungary, said representatives from the French insurance giant Euler Hermes, which declined to name the company. The request was "rather strange," the director noted later in an email, but the voice was so lifelike that he felt he had no choice but to comply. The insurer, whose case was first reported by the Wall Street Journal, provided new details on the theft to The Washington Post on Wednesday, including an email from the employee tricked by what the insurer is referring to internally as "the false Johannes." Now being developed by a wide range of Silicon Valley titans and AI startups, such voice-synthesis software can copy the rhythms and intonations of a person's voice and be used to produce convincing speech.


A Voice Deepfake Was Used To Scam A CEO Out Of $243,000

#artificialintelligence

Phone scams are nothing new, but the mark usually isn't an accomplished CEO. According to a new report in The Wall Street Journal, the CEO of an unnamed UK-based energy firm believed he was on the phone with his boss, the chief executive of the firm's German parent company, when he followed the orders to immediately transfer €220,000 (approx. $243,000). In fact, the voice belonged to a fraudster using AI voice technology to spoof the German chief executive. Rüdiger Kirsch of Euler Hermes Group SA, the firm's insurance company, shared the information with WSJ. He explained that the CEO recognized the subtle German accent in his boss's voice--and moreover that it carried the man's "melody."